Abstract

Entringer, Jackson, and Schatz conjectured in 1974 that every infinite
cubefree binary word contains arbitrarily long squares. In this paper we
show this conjecture is false: there exist infinite cubefree binary words
avoiding all squares $xx$ with $|x| \ge 4$, and the number 4 is best
possible. However, the Entringer-Jackson-Schatz conjecture is true if
"cubefree" is replaced with "overlap-free".



1 Introduction 

Let $\Sigma$ be a finite nonempty set, called an alphabet. We consider finite and
infinite words over $\Sigma$. The set of all finite words is denoted by $\Sigma^*$. The set
of all infinite words (that is, maps from $\mathbb{N}$ to $\Sigma$) is denoted by $\Sigma^\omega$.

A morphism is a map $h : \Sigma^* \to \Delta^*$ such that $h(xy) = h(x)h(y)$ for all
$x, y \in \Sigma^*$. A morphism may be specified by providing the image words $h(a)$
for all $a \in \Sigma$. If $h : \Sigma^* \to \Sigma^*$ and $h(a) = ax$ for some letter $a \in \Sigma$, then we
say that $h$ is prolongable on $a$, and we can then iterate $h$ infinitely often to
get the fixed point $h^\omega(a) := a\,x\,h(x)\,h^2(x)\,h^3(x) \cdots$.

A square is a nonempty word of the form $xx$, as in the English word
murmur. A cube is a nonempty word of the form $xxx$, as in the English
sort-of-word shshsh. An overlap is a word of the form $axaxa$, where $x$ is a
possibly empty word and $a$ is a single letter, as in the English word alfalfa.

It is well-known and easily proved that every word of length 4 or more
over a two-letter alphabet contains a square as a subword. However, Thue
proved in 1906 [4] that there exist infinite words over a three-letter alphabet
that contain no squares; such words are said to avoid squares or be squarefree.
Thue also proved that the word $\mu^\omega(0) = 0110100110010110 \cdots$ is overlap-free
(and hence cubefree); here $\mu$ is the morphism sending $0 \to 01$ and $1 \to 10$.
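The definitions above are easy to experiment with. The following minimal Python sketch iterates the Thue-Morse morphism and searches a finite prefix of its fixed point for overlaps. The helper names `apply_morphism`, `iterate`, and `has_overlap` are illustrative choices, not notation from the paper, and inspecting a finite prefix is only a sanity check, not a proof.

```python
# Thue-Morse morphism mu: 0 -> 01, 1 -> 10.
MU = {"0": "01", "1": "10"}

def apply_morphism(h, word):
    """Apply a morphism, given as a dict letter -> image, to a word."""
    return "".join(h[c] for c in word)

def iterate(h, letter, n):
    """Return h^n(letter)."""
    word = letter
    for _ in range(n):
        word = apply_morphism(h, word)
    return word

def has_overlap(word):
    """True if word contains a factor axaxa, that is, a factor of
    length 2p + 1 with period p for some p >= 1."""
    n = len(word)
    for p in range(1, n // 2 + 1):
        for i in range(n - 2 * p):
            if word[i:i + p + 1] == word[i + p:i + 2 * p + 1]:
                return True
    return False

prefix = iterate(MU, "0", 7)   # the first 128 letters of the fixed point
print(prefix[:16])             # 0110100110010110
print(has_overlap(prefix))     # False
```

Since $\mu$ is prolongable on 0, each $\mu^n(0)$ is a prefix of $\mu^{n+1}(0)$, which is why iterating from the single letter 0 yields longer and longer prefixes of the same infinite word.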

Entringer, Jackson, and Schatz [2] proved that while squares cannot be
avoided over a two-letter alphabet, arbitrarily long squares can. More
precisely, they proved that there exist infinite binary words with no squares
$xx$ with $|x| \ge 3$, and that the number 3 is best possible. Later, this result was
improved by Fraenkel and Simpson, who proved that there exist infinite
binary words where the only squares are 00, 11, and 0101.

Entringer, Jackson, and Schatz conjectured in 1974 that any infinite
cubefree word over {0, 1} contains arbitrarily long squares [2, Conjecture B, p.
163]. In this paper we show that this conjecture is false; there exist infinite
cubefree binary words with no squares $xx$ with $|x| \ge 4$. The number 4 is best
possible. Further, we show that the Entringer-Jackson-Schatz conjecture is
true if the word "cubefree" is replaced with "overlap-free".

2 A cubefree word without arbitrarily long 
squares 

In this section we disprove the conjecture of Entringer, Jackson, and Schatz. 
First we prove the following result. 

Theorem 1 There is a squarefree infinite word over {0,1,2,3} with no
occurrences of the subwords 12, 13, 21, 32, 231, or 10302.

Proof. Let the morphism $h$ be defined by

$$\begin{aligned}
0 &\to 0310201023 \\
1 &\to 0310230102 \\
2 &\to 0201031023 \\
3 &\to 0203010201
\end{aligned}$$

Then we claim the fixed point $h^\omega(0)$ has the desired properties.

First, we claim that if $w \in \{0,1,2,3\}^*$ then $h(w)$ has no occurrences of
12, 13, 21, 32, 231, or 10302. For if any of these words occur as subwords of
$h(w)$, they must occur within some $h(a)$ or straddling the boundary between
$h(a)$ and $h(b)$, for some single letters $a, b$. They do not; this easy verification
is left to the reader.
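The verification left to the reader can be carried out mechanically. The sketch below is one possible way to do it, not the authors' program: since every block $h(a)$ has length 10 and the longest forbidden word has length 5, it is enough to search the 16 two-block images $h(a)h(b)$. The names `H`, `FORBIDDEN`, and `forbidden_hits` are illustrative.

```python
from itertools import product

# The morphism h from the proof of Theorem 1.
H = {"0": "0310201023", "1": "0310230102",
     "2": "0201031023", "3": "0203010201"}
FORBIDDEN = ["12", "13", "21", "32", "231", "10302"]

def forbidden_hits(h, forbidden):
    """List all (a, b, f) such that the forbidden word f occurs in h(a)h(b).
    Any factor of length <= 5 of any h(w) lies inside some h(a)h(b),
    because each block has length 10."""
    hits = []
    for a, b in product(h, repeat=2):
        image = h[a] + h[b]
        hits += [(a, b, f) for f in forbidden if f in image]
    return hits

hits = forbidden_hits(H, FORBIDDEN)
print(hits)  # []
```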

Next, we prove that if w is any squarefree word over {0, 1, 2, 3} having no 
occurrences of 12, 13, 21, or 32, then h(w) is squarefree. 

We argue by contradiction. Let $w = a_1 a_2 \cdots a_n$ be a squarefree string such
that $h(w)$ contains a square, i.e., $h(w) = xyyz$ for some $x, z \in \{0,1,2,3\}^*$,
$y \in \{0,1,2,3\}^+$. Without loss of generality, assume that $w$ is a shortest such
string, so that $0 \le |x|, |z| < 10$.

Case 1: $|y| \le 20$. In this case we can take $|w| \le 5$. To verify that
$h(w)$ is squarefree, it therefore suffices to check each of the 49 possible words
$w \in \{0,1,2,3\}^5$ to ensure that $h(w)$ is squarefree in each case.
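The finite check in Case 1 can be sketched as follows. This is a minimal brute-force enumeration under the reading that the 49 words are the squarefree words of length 5 over {0,1,2,3} with no occurrence of 12, 13, 21, or 32; the helper `is_squarefree` and the variable names are illustrative, not taken from the paper.

```python
from itertools import product

# The morphism h from the proof of Theorem 1.
H = {"0": "0310201023", "1": "0310230102",
     "2": "0201031023", "3": "0203010201"}

def is_squarefree(word):
    """True if word contains no factor xx with x nonempty."""
    n = len(word)
    return not any(word[i:i + p] == word[i + p:i + 2 * p]
                   for p in range(1, n // 2 + 1)
                   for i in range(n - 2 * p + 1))

# Squarefree words of length 5 with no occurrence of 12, 13, 21, or 32.
words = [w for w in map("".join, product("0123", repeat=5))
         if is_squarefree(w)
         and not any(f in w for f in ("12", "13", "21", "32"))]

# Words whose image under h contains a square; the proof asserts there are none.
bad = [w for w in words if not is_squarefree("".join(H[c] for c in w))]
print(len(words), bad)
```

The enumeration visits all $4^5 = 1024$ candidates, so no pruning is needed at this size.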

Case 2: $|y| > 20$. First, we establish the following result.

Lemma 2 (a) Suppose $h(ab) = t\,h(c)\,u$ for some letters $a, b, c \in \{0,1,2,3\}$
and strings $t, u \in \{0,1,2,3\}^*$. Then this inclusion is trivial (that is,
$t = \epsilon$ or $u = \epsilon$) or $u$ is not a prefix of $h(d)$ for any $d \in \{0,1,2,3\}$.

(b) Suppose there exist letters $a, b, c$ and strings $s, t, u, v$ such that $h(a) = st$,
$h(b) = uv$, and $h(c) = sv$. Then either $a = c$ or $b = c$.

Proof. 

(a) This can be verified with a short computation. In fact, the only $a, b, c$
for which the equality $h(ab) = t\,h(c)\,u$ holds nontrivially is $h(31) = t\,h(2)\,u$,
and in this case $t = 020301$, $u = 0102$, so $u$ is not a prefix of
any $h(d)$.

(b) This can also be verified with a short computation. If $|s| \ge 6$, then no
two distinct letters have images sharing a prefix of length 6. If $|s| \le 5$, then
$|t| \ge 5$, and no two distinct letters have images sharing a suffix of length 5.

■ 
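Both parts of Lemma 2 are finite statements about the four images of $h$, so the "short computation" can be sketched directly. The functions `nontrivial_inclusions` and `split_exceptions` below are illustrative names for a brute-force search; the second one assumes all images have the same length, which holds for $h$.

```python
from itertools import product

# The morphism h from the proof of Theorem 1.
H = {"0": "0310201023", "1": "0310230102",
     "2": "0201031023", "3": "0203010201"}

def nontrivial_inclusions(h):
    """All (a, b, c, t, u) with h(a)h(b) = t h(c) u and t, u both nonempty."""
    found = []
    for a, b, c in product(h, repeat=3):
        image = h[a] + h[b]
        block = len(h[c])
        for k in range(1, len(image) - block):
            if image[k:k + block] == h[c]:
                found.append((a, b, c, image[:k], image[k + block:]))
    return found

def split_exceptions(h):
    """All (a, b, c, s) with h(a) = st, h(b) = uv, h(c) = sv,
    where a != c and b != c (assumes all images have equal length)."""
    found = []
    for a, b, c in product(h, repeat=3):
        if a == c or b == c:
            continue
        for k in range(len(h[c]) + 1):
            if h[a][:k] == h[c][:k] and h[b][k:] == h[c][k:]:
                found.append((a, b, c, h[c][:k]))
    return found

inclusions = nontrivial_inclusions(H)
print(inclusions)            # [('3', '1', '2', '020301', '0102')]
print(split_exceptions(H))   # []
```

For part (a) it then remains to observe that the single leftover string $u = 0102$ is not a prefix of any image, which `str.startswith` confirms.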

For $i = 1, 2, \ldots, n$ define $A_i = h(a_i)$. Then if $h(w) = xyyz$, we can write

$$h(w) = A_1 A_2 \cdots A_n = A_1' A_1'' A_2 \cdots A_{j-1} A_j' A_j'' A_{j+1} \cdots A_{n-1} A_n' A_n''$$

where

$$\begin{aligned}
A_1 &= A_1' A_1'' \\
A_j &= A_j' A_j'' \\
A_n &= A_n' A_n'' \\
x &= A_1' \\
y &= A_1'' A_2 \cdots A_{j-1} A_j' = A_j'' A_{j+1} \cdots A_{n-1} A_n' \\
z &= A_n'',
\end{aligned}$$

and $|A_1''|, |A_j''| > 0$. See Figure 1.

Figure 1: The string $xyyz$ within $h(w)$



If $|A_1''| > |A_j''|$, then $A_{j+1} = h(a_{j+1})$ is a subword of $A_1'' A_2$, hence a subword
of $A_1 A_2 = h(a_1 a_2)$. Thus we can write $A_{j+2} = A_{j+2}' A_{j+2}''$ with

$$A_1'' A_2 = A_j'' A_{j+1} A_{j+2}'.$$

See Figure 2.

Figure 2: The case $|A_1''| > |A_j''|$

But then, by Lemma 2 (a), either $|A_j''| = 0$, or $|A_1''| = |A_j''|$, or $A_{j+2}'$ is
not a prefix of any $h(d)$. All three conclusions are impossible.

If $|A_1''| < |A_j''|$, then $A_2 = h(a_2)$ is a subword of $A_j'' A_{j+1}$, hence a subword
of $A_j A_{j+1} = h(a_j a_{j+1})$. Thus we can write $A_3 = A_3' A_3''$ with

$$A_1'' A_2 A_3' = A_j'' A_{j+1}.$$

See Figure 3.

Figure 3: The case $|A_1''| < |A_j''|$



By Lemma 2 (a), either $|A_1''| = 0$ or $|A_1''| = |A_j''|$ or $A_3'$ is not a prefix of
any $h(d)$. Again, all three conclusions are impossible.

Therefore $|A_1''| = |A_j''|$. Hence $A_1'' = A_j''$, $A_2 = A_{j+1}$, $\ldots$, $A_{j-1} = A_{n-1}$,
and $A_j' = A_n'$. Since $h$ is injective, we have $a_2 = a_{j+1}, \ldots, a_{j-1} = a_{n-1}$.
It also follows that $|y|$ is divisible by 10 and $A_j = A_j' A_j'' = A_n' A_1''$. But
by Lemma 2 (b), either (1) $a_j = a_n$ or (2) $a_j = a_1$. In the first case,
$a_2 \cdots a_{j-1} a_j = a_{j+1} \cdots a_{n-1} a_n$, so $w$ contains the square
$(a_2 \cdots a_{j-1} a_j)^2$, a contradiction. In the second case,
$a_1 \cdots a_{j-1} = a_j a_{j+1} \cdots a_{n-1}$, so $w$ contains
the square $(a_1 \cdots a_{j-1})^2$, a contradiction.

It now follows that the infinite word

$$h^\omega(0) = 03102010230203010201031023010203102010230201031023 \cdots$$

is squarefree and contains no occurrences of 12, 13, 21, 32, 231, or 10302. ■

Theorem 3 Let $w$ be any infinite word satisfying the conditions of
Theorem 1. Define a morphism $g$ by

$$\begin{aligned}
0 &\to 010011 \\
1 &\to 010110 \\
2 &\to 011001 \\
3 &\to 011010
\end{aligned}$$

Then $g(w)$ is a cubefree word containing no squares $xx$ with $|x| \ge 4$.

Before we begin the proof, we remark that all the words 12, 13, 21, 32, 
231, 10302 must indeed be avoided, because 



$g(12)$ contains the squares $(0110)^2$, $(1100)^2$

$g(13)$ contains the square $(0110)^2$

$g(21)$ contains the cube $(01)^3$

$g(32)$ contains the square $(1001)^2$

$g(231)$ contains the square $(10010110)^2$

$g(10302)$ contains the square $(100100110110)^2$.
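Each of these remarks is a statement about a short explicit binary string, so it can be confirmed by a substring search. The following sketch encodes the list above as triples; the names `G`, `g_image`, `REMARKS`, and `checks` are illustrative.

```python
# The morphism g from Theorem 3.
G = {"0": "010011", "1": "010110", "2": "011001", "3": "011010"}

def g_image(word):
    """Return g(word) as a binary string."""
    return "".join(G[c] for c in word)

# (word to be avoided, repeated block x, exponent e), as listed in the text.
REMARKS = [("12", "0110", 2), ("12", "1100", 2), ("13", "0110", 2),
           ("21", "01", 3), ("32", "1001", 2),
           ("231", "10010110", 2), ("10302", "100100110110", 2)]

checks = [(w, x, e, x * e in g_image(w)) for w, x, e in REMARKS]
for w, x, e, ok in checks:
    print(f"g({w}) contains ({x})^{e}: {ok}")   # True on every line
```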



Proof. The proof parallels the proof of Theorem 1. Let $w = a_1 a_2 \cdots a_n$ be a
squarefree string, with no occurrences of 12, 13, 21, 32, 231, or 10302. We first
establish that if $g(w) = xyyz$ for some $x, z \in \{0,1\}^*$, $y \in \{0,1\}^+$,
then $|y| \le 3$. Without loss of generality, assume $w$ is a shortest such string,
so $0 \le |x|, |z| < 6$.

Case 1: $|y| \le 12$. In this case we can take $|w| \le 5$. To verify that $g(w)$
contains no squares $yy$ with $|y| \ge 4$, it suffices to check each of the 41 possible
words $w \in \{0,1,2,3\}^5$.
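This finite check can also be sketched by brute force. The code below assumes the 41 words are the squarefree words of length 5 over {0,1,2,3} avoiding all six forbidden subwords, and records the largest $|y|$ for which some $g(w)$ contains a square $yy$. The helper `square_periods` is an illustrative name.

```python
from itertools import product

# The morphism g from Theorem 3 and the subwords excluded by Theorem 1.
G = {"0": "010011", "1": "010110", "2": "011001", "3": "011010"}
FORBIDDEN = ["12", "13", "21", "32", "231", "10302"]

def square_periods(word):
    """Set of all p >= 1 such that word contains a square xx with |x| = p."""
    n = len(word)
    return {p for p in range(1, n // 2 + 1)
            for i in range(n - 2 * p + 1)
            if word[i:i + p] == word[i + p:i + 2 * p]}

# Squarefree words of length 5 with no forbidden subword.
words = [w for w in map("".join, product("0123", repeat=5))
         if not square_periods(w) and not any(f in w for f in FORBIDDEN)]

# Largest |y| over all squares yy found in any image g(w).
worst = max(max(square_periods("".join(G[c] for c in w))) for w in words)
print(len(words), worst)
```

The check succeeds if `worst` is at most 3. Squares with $|y| = 3$ do occur, for example $(011)^2$ inside $g(03)$.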

Case 2: $|y| > 12$. First, we establish the analogue of Lemma 2.

Lemma 4 (a) Suppose $g(ab) = t\,g(c)\,u$ for some letters $a, b, c \in \{0,1,2,3\}$,
where $a \ne b$ and $ab$ is not one of 12, 13, 21, 32,
and strings $t, u \in \{0,1\}^*$. Then this inclusion is trivial (that is,
$t = \epsilon$ or $u = \epsilon$) or $u$ is not a prefix of $g(d)$ for any $d \in \{0,1,2,3\}$.

(b) Suppose there exist letters $a, b, c$ and strings $s, t, u, v$ such that $g(a) = st$,
$g(b) = uv$, and $g(c) = sv$. Then either $a = c$ or $b = c$, or $a = 2$, $b = 1$,
$c = 3$, $s = 0110$, $t = 01$, $u = 0101$, $v = 10$.

Proof.

(a) This can be verified with a short computation. The only such $a, b, c$ for
which $g(ab) = t\,g(c)\,u$ holds nontrivially are

$$\begin{aligned}
g(01) &= 010\, g(3)\, 110 \\
g(10) &= 01\, g(2)\, 0011 \\
g(23) &= 0110\, g(1)\, 10.
\end{aligned}$$

But none of 110, 0011, 10 are prefixes of any $g(d)$.

(b) If $|s| \ge 5$ then no two distinct letters have images sharing a prefix of
length 5. If $|s| \le 3$ then $|t| \ge 3$, and no two distinct letters have images
sharing a suffix of length 3. Hence $|s| = 4$, $|t| = 2$. But only $g(2)$ and $g(3)$
share a prefix of length 4, and only $g(1)$ and $g(3)$ share a suffix of length 2.

■ 
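As with Lemma 2, both parts of Lemma 4 can be checked by exhaustive search. One assumption of this sketch should be stated loudly: part (a) is searched only over pairs $ab$ that can occur in a word $w$ satisfying the hypotheses, that is, $a \ne b$ and $ab$ not among 12, 13, 21, 32. Without that restriction further inclusions appear, for instance $g(00) = 010\,g(3)\,011$. All function and variable names are illustrative.

```python
from itertools import product

# The morphism g from Theorem 3.
G = {"0": "010011", "1": "010110", "2": "011001", "3": "011010"}

# Assumption of this sketch: only two-letter words that can occur in w.
ALLOWED_PAIRS = [a + b for a, b in product("0123", repeat=2)
                 if a != b and a + b not in ("12", "13", "21", "32")]

def nontrivial_inclusions(h, pairs):
    """All (ab, c, t, u) with h(ab) = t h(c) u and t, u both nonempty."""
    found = []
    for ab, c in product(pairs, h):
        image = h[ab[0]] + h[ab[1]]
        block = len(h[c])
        for k in range(1, len(image) - block):
            if image[k:k + block] == h[c]:
                found.append((ab, c, image[:k], image[k + block:]))
    return found

def split_exceptions(h):
    """All (a, b, c, s, t, u, v) with h(a) = st, h(b) = uv, h(c) = sv,
    where a != c and b != c (assumes all images have equal length)."""
    found = []
    for a, b, c in product(h, repeat=3):
        if a == c or b == c:
            continue
        for k in range(len(h[c]) + 1):
            if h[a][:k] == h[c][:k] and h[b][k:] == h[c][k:]:
                found.append((a, b, c, h[c][:k], h[a][k:], h[b][:k], h[c][k:]))
    return found

inclusions = nontrivial_inclusions(G, ALLOWED_PAIRS)
exceptions = split_exceptions(G)
print(inclusions)
print(exceptions)   # [('2', '1', '3', '0110', '01', '0101', '10')]
```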

The rest of the proof is exactly parallel to the proof of Theorem 1, with
the following exception. When we get to the final case, where $|y|$ is divisible
by 6, we can use Lemma 4 (b) to rule out every case except where $x = 0101$,
$z = 01$, $a_1 = 1$, $a_j = 3$, and $a_n = 2$. Thus $w = 1\alpha 3\alpha 2$ for some string
$\alpha \in \{0,1,2,3\}^*$. This special case is ruled out by the following lemma:
Lemma 5 Suppose $\alpha \in \{0,1,2,3\}^*$, and let $w = 1\alpha 3\alpha 2$. Then either $w$
contains a square, or $w$ contains an occurrence of one of the subwords 12,
13, 21, 32, 231, or 10302.

Proof. This can be verified by checking (a) all strings $w$ with $|w| \le 4$, and
(b) all strings of the form $w = abcw'de$, where $a, b, c, d, e \in \{0,1,2,3\}$ and
$w' \in \{0,1,2,3\}^*$. (Here $w'$ may be treated as an indeterminate.) ■
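Lemma 5 can be spot-checked numerically. The sketch below does not reproduce the symbolic argument with an indeterminate middle part; it simply tests every $\alpha$ up to an arbitrarily chosen length bound `MAX_LEN` (an assumption of this example), so it is evidence rather than proof. The names `has_square`, `violates`, and `counterexamples` are illustrative.

```python
from itertools import product

FORBIDDEN = ["12", "13", "21", "32", "231", "10302"]

def has_square(word):
    """True if word contains a factor xx with x nonempty."""
    n = len(word)
    return any(word[i:i + p] == word[i + p:i + 2 * p]
               for p in range(1, n // 2 + 1)
               for i in range(n - 2 * p + 1))

def violates(word):
    """True if word contains a square or one of the forbidden subwords."""
    return has_square(word) or any(f in word for f in FORBIDDEN)

MAX_LEN = 6  # arbitrary cutoff for this illustration
counterexamples = [alpha
                   for n in range(MAX_LEN + 1)
                   for alpha in map("".join, product("0123", repeat=n))
                   if not violates("1" + alpha + "3" + alpha + "2")]
print(counterexamples)  # []
```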

It now remains to show that if $w$ is squarefree and contains no occurrence
of 12, 13, 21, 32, 231, or 10302, then $g(w)$ is cubefree. If $g(w)$ contains a cube
$yyy$, then it contains a square $yy$, and from what precedes we know $|y| \le 3$.
It therefore suffices to show that $g(w)$ contains no occurrence of $0^3$, $1^3$, $(01)^3$,
$(10)^3$, $(001)^3$, $(010)^3$, $(011)^3$, $(100)^3$, $(101)^3$, $(110)^3$. The longest such string
is of length 9, so it suffices to examine the 16 possibilities for $g(w)$ where
$|w| = 3$. This is left to the reader.
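The check left to the reader is small enough to sketch in a few lines: enumerate the admissible words of length 3 and search each image for the ten cubes listed above. The names `CUBES`, `words`, and `hits` are illustrative.

```python
from itertools import product

# The morphism g from Theorem 3 and the subwords excluded by Theorem 1.
G = {"0": "010011", "1": "010110", "2": "011001", "3": "011010"}
FORBIDDEN = ["12", "13", "21", "32", "231", "10302"]
# The ten cubes yyy with |y| <= 3 over {0, 1} listed in the text.
CUBES = [y * 3 for y in ("0", "1", "01", "10", "001",
                         "010", "011", "100", "101", "110")]

def has_square(word):
    """True if word contains a factor xx with x nonempty."""
    n = len(word)
    return any(word[i:i + p] == word[i + p:i + 2 * p]
               for p in range(1, n // 2 + 1)
               for i in range(n - 2 * p + 1))

# Squarefree words of length 3 with no forbidden subword.
words = [w for w in map("".join, product("0123", repeat=3))
         if not has_square(w) and not any(f in w for f in FORBIDDEN)]

# Pairs (w, cube) such that the cube occurs in g(w).
hits = [(w, c) for w in words for c in CUBES
        if c in "".join(G[x] for x in w)]
print(len(words), hits)  # 16 []
```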

The proof of Theorem 3 is now complete. ■

Corollary 6 If $g$ and $h$ are defined as above, then

$$g(h^\omega(0)) = 010011011010010110010011011001010011010110010011011001011010 \cdots$$

is cubefree, and avoids all squares $xx$ with $|x| \ge 4$.
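A prefix of this word is easy to generate and inspect. The following sketch computes the first 100 letters of the fixed point of $h$, applies $g$, and searches the resulting 600 binary letters for cubes and for squares $xx$ with $|x| \ge 4$. The prefix length is an arbitrary choice made for this example and the function names are illustrative; a finite prefix illustrates the corollary but does not prove it.

```python
# The morphisms h (Theorem 1) and g (Theorem 3).
H = {"0": "0310201023", "1": "0310230102",
     "2": "0201031023", "3": "0203010201"}
G = {"0": "010011", "1": "010110", "2": "011001", "3": "011010"}

def apply_morphism(m, word):
    """Apply a morphism, given as a dict letter -> image, to a word."""
    return "".join(m[c] for c in word)

def fixed_point_prefix(h, letter, length):
    """Prefix of the fixed point of h started at `letter`;
    requires h to be prolongable on `letter`."""
    word = letter
    while len(word) < length:
        word = apply_morphism(h, word)
    return word[:length]

def max_repetition_period(word, exponent):
    """Largest p such that word contains x^exponent with |x| = p (0 if none)."""
    n, best = len(word), 0
    for p in range(1, n // exponent + 1):
        for i in range(n - exponent * p + 1):
            if word[i:i + (exponent - 1) * p] == word[i + p:i + exponent * p]:
                best = p
                break
    return best

w = fixed_point_prefix(H, "0", 100)
gw = apply_morphism(G, w)                 # 600 binary letters
print(gw[:60])
print(max_repetition_period(gw, 3))       # 0 means no cube was found
print(max_repetition_period(gw, 2))       # largest |x| with xx in the prefix
```

The test inside `max_repetition_period` uses the fact that a word of length $ep$ equals $x^e$ with $|x| = p$ exactly when it agrees with itself shifted by $p$.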

3 The constant 4 is best possible 

It is natural to wonder if the constant 4 in Corollary 6 can be improved. It
cannot, as the following theorem shows.

Theorem 7 Every binary word of length $\ge 30$ contains a cube or a square
$xx$ with $|x| \ge 3$.

Proof. This may be proved purely mechanically. More generally, let
$P \subseteq \Sigma^*$ be a set of subwords to be avoided. We create and traverse a certain
tree $T$, as follows. The root of the tree is labeled $\epsilon$. If a node is labeled $x$ and
contains no subword in $P$, then it has children labeled $xa$ for each $a \in \Sigma$;
otherwise it is a leaf of $T$. This tree is infinite if and only if there is an infinite
word avoiding the elements of $P$.

If $T$ is finite, then the height of $T$ gives the length $l$ such that every word
of length $l$ or greater contains an element of $P$. The tree can be created and
traversed using a queue and breadth-first search.

If the set $P$ is symmetric under renaming of the letters, as it is in this
case, we may further improve the procedure by labeling the root with any
particular letter, say 0. When we run this procedure on the statement of
the theorem, we obtain a tree with 289 leaves, the longest being of length
30. The unique string of length 29 starting with 0 and avoiding cubes and
squares $xx$ with $|x| \ge 3$ is 00110010100110101100101001100. ■
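The breadth-first procedure described in this proof can be sketched as follows. This is a minimal level-by-level variant, not the authors' program: it keeps only the internal nodes (words that still avoid the patterns), so it reports the longest avoiding words rather than the leaf count. The names `ends_badly` and `longest_avoiding`, and the safety bound `max_len`, are assumptions of this example.

```python
def ends_badly(word):
    """True if some suffix of word is a cube, or a square xx with |x| >= 3.
    Only suffixes matter, because every proper prefix was already checked."""
    n = len(word)
    for p in range(1, n // 2 + 1):
        if word[n - p:] == word[n - 2 * p:n - p]:      # suffix is a square
            if p >= 3:
                return True
            if n >= 3 * p and word[n - 3 * p:n - 2 * p] == word[n - p:]:
                return True                            # suffix is a cube
    return False

def longest_avoiding(start="0", max_len=100):
    """Breadth-first search, one level per word length, over binary words
    beginning with `start` that avoid cubes and squares xx with |x| >= 3.
    Returns the words on the last nonempty level."""
    level = [start]
    for _ in range(max_len):
        nxt = [w + a for w in level for a in "01" if not ends_badly(w + a)]
        if not nxt:
            return level
        level = nxt
    raise RuntimeError("search did not terminate within max_len")

longest = longest_avoiding()
print(len(longest[0]), longest)
```

If the search ends with words of length 29, then every word of length 30 that starts with 0 contains one of the patterns, and the symmetry under exchanging 0 and 1 extends this to all binary words.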

4 Overlap-free words contain arbitrarily long 
squares 

It is also natural to wonder if a result like Corollary 6 holds if "cubefree" is
replaced with "overlap-free". It does not, as the following result shows.

Theorem 8 Any infinite overlap-free word over {0, 1} contains arbitrarily 
long squares. 

Proof. By Lemma 3] we know that if x is an overlap-free infinite word 
over {0, 1}, then there exist a word $u \in \{\epsilon, 0, 1, 00, 11\}$ and an overlap-free
infinite word $y$ such that $x = u\mu(y)$, where $\mu$ is the Thue-Morse morphism.
By iterating this theorem, we get that every overlap-free infinite word must
contain $\mu^n(0)$ for arbitrarily large $n$; hence contains arbitrarily long squares.
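The last step can be made concrete. Since $\mu^3(0) = 01101001$ contains the square 11, the word $\mu^n(0) = \mu^{n-3}(\mu^3(0))$ contains the square $\mu^{n-3}(1)\,\mu^{n-3}(1)$, whose length $2^{n-2}$ grows without bound. The short sketch below checks this for a few values of $n$; the range of $n$ and the names `mu_power` and `rows` are arbitrary choices for this illustration.

```python
# Thue-Morse morphism mu: 0 -> 01, 1 -> 10.
MU = {"0": "01", "1": "10"}

def mu_power(word, n):
    """Return mu^n(word)."""
    for _ in range(n):
        word = "".join(MU[c] for c in word)
    return word

rows = []
for n in range(3, 11):
    x = mu_power("1", n - 3)                       # |x| = 2^(n-3)
    rows.append((n, len(x), (x + x) in mu_power("0", n)))

for n, size, found in rows:
    print(f"mu^{n}(0) contains a square xx with |x| = {size}: {found}")
```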